First, we’ll load libraries and define some functions for colors.
As a reminder, this is our sampling scheme. We sampled legs from 31 female Bombus franklini specimens and performed whole-genome sequencing.
## Reading layer `us_eco_l3_state_boundaries' from data source
## `/Users/rena.schweizer/usda_beelab/projects/bombus_franklini_pop_gen/sample_info/us_eco_l3_state_boundaries/us_eco_l3_state_boundaries.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 1631 features and 13 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: -2356069 ymin: 272048.5 xmax: 2258225 ymax: 3172577
## Projected CRS: USA_Contiguous_Albers_Equal_Area_Conic_USGS_version
Let’s examine some sequencing metrics for our data set.
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
##
## Call:
## lm(formula = meanCov.raw ~ CollectionYr, data = stats_meta)
##
## Residuals:
## Min 1Q Median 3Q Max
## -31.765 -14.506 -5.765 10.753 44.131
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -975.0270 458.2269 -2.128 0.0420 *
## CollectionYr 0.5130 0.2316 2.215 0.0348 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 20.96 on 29 degrees of freedom
## Multiple R-squared: 0.1447, Adjusted R-squared: 0.1152
## F-statistic: 4.905 on 1 and 29 DF, p-value: 0.0348
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
##
## Call:
## lm(formula = uniq.MQ20.PR ~ CollectionYr, data = stats_meta)
##
## Residuals:
## Min 1Q Median 3Q Max
## -60197257 -26269445 -3453216 15997548 73338946
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.533e+09 7.914e+08 -1.938 0.0625 .
## CollectionYr 8.130e+05 4.000e+05 2.032 0.0514 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 36190000 on 29 degrees of freedom
## Multiple R-squared: 0.1247, Adjusted R-squared: 0.09448
## F-statistic: 4.13 on 1 and 29 DF, p-value: 0.05137
## [1] "The mean number of sequence reads is 69128178 with a std dev of 30902666"
## [1] "The mean mapping rate is 94 % with a std dev of 4 %"
## [1] "The mean raw mapping quality is 17 with a std dev of 0"
## [1] "The mean PCR duplicate removal rate is 17 % with a std dev of 8 %"
## [1] "The mean raw coverage is 40 with a std dev of 22"
## [1] "The mean coverage for reads >MQ20 is 27 with a std dev of 16"
## [1] "The mean mapping quality for reads >MQ20 is 41 with a std dev of 1"
While we have a range in genome coverage, we should have high-enough coverage to obtain high-quality genotypes. Additionally, despite using the reference genome of a related species (Bombus affinis), we have a high mapping rate.
Next, let’s look at statistics between some different filtering data sets. The filter sets are:
First, we’ll look at individual depth of coverage.
Next, we’ll look at the fraction of missing data per-individual.
As expected, imposing a filter of max missing data reduces the overall fraction of per-individual missingness. Removing the 6 low-coverage individuals increases the number of called sites at 75%.
Let’s explore the genome-wide heterozygosity. This measure takes into account ALL sequenced sites in the genome and is generally a more accurate measure than SNP heterozygosity that is less susceptible to sample sizes and is more transferrable across studies. https://doi.org/10.1111/2041-210X.13659
Kardos & Waples 2024 suggest mean 15-20x filter to remove positive correlation between heterozygosity and DP. Let’s make sure that we have removed any sort of correlation.
It seems as though our depth of coverage filters and removing 6 low-quality individuals has improved our overall data quality.
Next, we look at Ts/Tv ratios. Usually, a genome-wide rate of ~2 is expected for vertebrates, although less seems to be known about invertebrates. Lozier 2014 Molecular Ecology found a Ts/Tv ratio of ~3.25 in B. impatiens and B. pensylvanicus.
## [1] "The Ts/Tv ratio for Filter 1 is 2.35."
## [1] "The Ts/Tv ratio for Filter 2 is 3.29."
## [1] "The Ts/Tv ratio for Filter 3 is 3.67."
## [1] "The Ts/Tv ratio for Filter 4 is 3.5."
## [1] "The Ts/Tv ratio for Filter 5 is 3.75."
With our more stringent filters, we observe a Ts/Tv ratio for B. franklini that is similar to Lozier et al.
Let’s explore the relationship between genome-wide heterozygosity and collection year.
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
##
## Call:
## lm(formula = het ~ CollectionYr, data = genomehet_meta[genomehet_meta$myFilter ==
## "F4", ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.202e-04 -3.982e-06 2.883e-06 1.459e-05 2.549e-05
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.008e-04 7.117e-04 0.844 0.407
## CollectionYr -1.141e-07 3.595e-07 -0.317 0.754
##
## Residual standard error: 2.952e-05 on 23 degrees of freedom
## Multiple R-squared: 0.004359, Adjusted R-squared: -0.03893
## F-statistic: 0.1007 on 1 and 23 DF, p-value: 0.7539
##
## Call:
## lm(formula = het ~ CollectionYr, data = genomehet_4_meta)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.202e-04 -3.982e-06 2.883e-06 1.459e-05 2.549e-05
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.008e-04 7.117e-04 0.844 0.407
## CollectionYr -1.141e-07 3.595e-07 -0.317 0.754
##
## Residual standard error: 2.952e-05 on 23 degrees of freedom
## Multiple R-squared: 0.004359, Adjusted R-squared: -0.03893
## F-statistic: 0.1007 on 1 and 23 DF, p-value: 0.7539
##
## Call:
## lm(formula = het ~ CollectionYr, data = genomehet_5_meta)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.503e-05 -3.640e-07 2.339e-06 6.642e-06 1.718e-05
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.884e-04 3.785e-04 0.762 0.454
## CollectionYr -4.061e-08 1.912e-07 -0.212 0.834
##
## Residual standard error: 1.57e-05 on 23 degrees of freedom
## Multiple R-squared: 0.001958, Adjusted R-squared: -0.04143
## F-statistic: 0.04513 on 1 and 23 DF, p-value: 0.8336
##
## Wilcoxon rank sum exact test
##
## data: het by decline
## W = 82, p-value = 0.2699
## alternative hypothesis: true location shift is not equal to 0
We do not observe a difference in genome-wide heterozygosity between pre-1995 and 1995 samples.
Now we can compare the genome-wide heterozygosity to a set of values published in Robinson et al. 2016.
## Warning in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,
## : EOF within quoted string
## Warning in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,
## : number of items read is not a multiple of the number of columns
We see that Bombus franklini has genome-wide heterozygosity on the low end of published values.
Let’s look at PCA plots of the data. Here, we will use only F5, allowing for no missing data.
Next, we will confirm/check that our PCA does not qualitatively seem to be driven by depth of coverage, but rather by some biological signal (maybe lat/long?).
## Warning: NAs introduced by coercion
## Warning: NAs introduced by coercion
Next, we’ll look at the admixture results. The most likely K was 1, but we will plot a few K values.
Again, no discernible structuring? Let’s take a look at relatedness among individuals.
We calculated pairwise relatedness using KING in plink. Note that KING kinship coefficients are scaled such that duplicate samples have kinship 0.5, not 1. First-degree relations (parent-child, full siblings) correspond to ~0.25, second-degree relations correspond to ~0.125, etc.
It seems as though there is not much relatedness amongst individuals. This makes sense given that they are from different sampling years and geographic locations. However, BLX3233, which has fairly low heterozygosity does not seem to have almost no kinship with the other samples. Is it worth removing? Seems odd.
Next, we will calculate and visualize runs of homozygosity (ROH). ROH represent a measure of shared ancestry in the population. Long ROHs represent recent inbreeding in populations, while shorter ROH indicate historical founder or bottleneck events. ROH can also demonstrate non-random mating and/or strong selection for a specific genomic region, although the latter should be present in only a fraction of the genome.
We use the software bcftools to calculate ROH using the 75% max-missing data set, with no singletons, for the 18 autosomes of the B. affinis genome.
## Loading file from BCFtools
## [1] "Amongst all individuals, the number of ROH > 1 Mb is 39."
## [1] "Amongst all individuals, the number of ROH between 250 Kb and 1 Mb is 197."
## [1] "Amongst all individuals, the number of ROH > 250 Kb is 236."
## [1] "Current population: 1950"
## [1] "CHR: 1"
## [1] "CHR: 2"
## [1] "CHR: 3"
## [1] "CHR: 4"
## [1] "CHR: 5"
## [1] "CHR: 6"
## [1] "CHR: 7"
## [1] "CHR: 8"
## [1] "CHR: 9"
## [1] "CHR: 10"
## [1] "CHR: 11"
## [1] "CHR: 12"
## [1] "CHR: 13"
## [1] "CHR: 14"
## [1] "CHR: 15"
## [1] "CHR: 16"
## [1] "CHR: 17"
## [1] "CHR: 18"
## [1] "Current population: 1967"
## [1] "CHR: 1"
## [1] "CHR: 2"
## [1] "CHR: 3"
## [1] "CHR: 4"
## [1] "CHR: 5"
## [1] "CHR: 6"
## [1] "CHR: 7"
## [1] "CHR: 8"
## [1] "CHR: 9"
## [1] "CHR: 10"
## [1] "CHR: 11"
## [1] "CHR: 12"
## [1] "CHR: 13"
## [1] "CHR: 14"
## [1] "CHR: 15"
## [1] "CHR: 16"
## [1] "CHR: 17"
## [1] "CHR: 18"
## [1] "Current population: 1968"
## [1] "CHR: 1"
## [1] "CHR: 2"
## [1] "CHR: 3"
## [1] "CHR: 4"
## [1] "CHR: 5"
## [1] "CHR: 6"
## [1] "CHR: 7"
## [1] "CHR: 8"
## [1] "CHR: 9"
## [1] "CHR: 10"
## [1] "CHR: 11"
## [1] "CHR: 12"
## [1] "CHR: 13"
## [1] "CHR: 14"
## [1] "CHR: 15"
## [1] "CHR: 16"
## [1] "CHR: 17"
## [1] "CHR: 18"
## [1] "Current population: 1980"
## [1] "CHR: 1"
## [1] "CHR: 2"
## [1] "CHR: 3"
## [1] "CHR: 4"
## [1] "CHR: 5"
## [1] "CHR: 6"
## [1] "CHR: 7"
## [1] "CHR: 8"
## [1] "CHR: 9"
## [1] "CHR: 10"
## [1] "CHR: 11"
## [1] "CHR: 12"
## [1] "CHR: 13"
## [1] "CHR: 14"
## [1] "CHR: 15"
## [1] "CHR: 16"
## [1] "CHR: 17"
## [1] "CHR: 18"
## [1] "Current population: 1990"
## [1] "CHR: 1"
## [1] "CHR: 2"
## [1] "CHR: 3"
## [1] "CHR: 4"
## [1] "CHR: 5"
## [1] "CHR: 6"
## [1] "CHR: 7"
## [1] "CHR: 8"
## [1] "CHR: 9"
## [1] "CHR: 10"
## [1] "CHR: 11"
## [1] "CHR: 12"
## [1] "CHR: 13"
## [1] "CHR: 14"
## [1] "CHR: 15"
## [1] "CHR: 16"
## [1] "CHR: 17"
## [1] "CHR: 18"
## [1] "Current population: 1995"
## [1] "CHR: 1"
## [1] "CHR: 2"
## [1] "CHR: 4"
## [1] "CHR: 5"
## [1] "CHR: 6"
## [1] "CHR: 7"
## [1] "CHR: 8"
## [1] "CHR: 9"
## [1] "CHR: 10"
## [1] "CHR: 11"
## [1] "CHR: 13"
## [1] "CHR: 14"
## [1] "CHR: 15"
## [1] "CHR: 16"
## [1] "CHR: 17"
## [1] "CHR: 18"
## [1] "Current population: 1998"
## [1] "CHR: 1"
## [1] "CHR: 2"
## [1] "CHR: 3"
## [1] "CHR: 4"
## [1] "CHR: 5"
## [1] "CHR: 6"
## [1] "CHR: 7"
## [1] "CHR: 8"
## [1] "CHR: 9"
## [1] "CHR: 10"
## [1] "CHR: 11"
## [1] "CHR: 12"
## [1] "CHR: 13"
## [1] "CHR: 14"
## [1] "CHR: 15"
## [1] "CHR: 16"
## [1] "CHR: 17"
## [1] "CHR: 18"
## [1] "Current population: 1950"
## [1] "CHR: 13"
## [1] "CHR: 14"
## [1] "CHR: 15"
## [1] "Current population: 1967"
## [1] "CHR: 2"
## [1] "CHR: 3"
## [1] "CHR: 8"
## [1] "CHR: 11"
## [1] "CHR: 14"
## [1] "CHR: 15"
## [1] "Current population: 1968"
## [1] "CHR: 2"
## [1] "CHR: 15"
## [1] "CHR: 17"
## [1] "Current population: 1980"
## [1] "CHR: 3"
## [1] "CHR: 13"
## [1] "Current population: 1990"
## [1] "CHR: 10"
## [1] "CHR: 12"
## [1] "CHR: 16"
## [1] "Current population: 1998"
## [1] "CHR: 1"
## [1] "CHR: 2"
## [1] "CHR: 3"
## [1] "CHR: 4"
## [1] "CHR: 5"
## [1] "CHR: 6"
## [1] "CHR: 7"
## [1] "CHR: 8"
## [1] "CHR: 9"
## [1] "CHR: 10"
## [1] "CHR: 11"
## [1] "CHR: 13"
## [1] "CHR: 14"
## [1] "CHR: 15"
## [1] "CHR: 16"
## [1] "CHR: 18"
## `summarise()` has grouped output by 'id'. You can override using the `.groups`
## argument.
## [1] 0.01867904
## [1] 0.05726572
From our ROH analysis, it seems as though we have some individuals who have large proportions of their genomes that are >1 Mb and homozygous. There does not seem to be an increase in length of ROH associated with year because we have individuals from 1967 and 1998 that have long ROHs. It does appear as if a couple of individuals have extremely long ROH, while many individuals have shorter ROH. We may want to adjust the sizes of ROH that we look for, but 1 Mb is generally considered recent and 250 Kb reflects more ancient shared ancestry. We can also estimate the TMRCA of different size segments if we assume a recombination rate and generation time in bumble bees. In this case, a 1 Mb length indicates shared ancestry ~5.74 years ago an 250 Kb indicates shared ancestry ~23 years ago.
We calculate genome-wide heterozygosity in 100 kb windows, with heterozygosity as defined in Robinson et al. 2022: “We define heterozygosity as the proportion of all called genotypes within a single individual that are heterozygous. This value represents the average number of pairwise differences per site between homologous chromosomes within a single individual. As we generated VCF files that included invariant positions, the denominator includes all sites with called genotypes, including those that are invariant and match the reference.”
## Warning: Removed 8748 rows containing missing values (`position_stack()`).
## Warning: Removed 8748 rows containing missing values (`position_stack()`).
## Removed 8748 rows containing missing values (`position_stack()`).
## Removed 8748 rows containing missing values (`position_stack()`).
Bar plots of per-site heterozygosity and histograms of heterozygosity per window show that some individuals have extended ROH. Also note the excess number of windows with zero heterozygosity in several individuals indicating windows within large ROH.
We will now examine results from PSMC of high-coverage genomes. Recommended filters for running PSMC are to use individuals with a high mean genome coverage (>18 x) to reduce sampling bias due to low read coverage over regions, as well as filtering for high read quality and mapping quality (q20 and Q20 in bcftools), and only examining chromosome-level sequences (here, the 18 autosomes of B. affinis). I have plotted both the PSMC results for all 19 high-coverage individuals, and for 4 individuals from 4 different sampling time points for ease of viewing.
## [1] 11.87856
## [1] 16.83166
## [1] 8.503777
## [1] 2.477786
## [1] 0.8080004
## [1] 0.03171547
## [1] 2.941955
## [1] 0.2649606
## [1] 9.650723
## [1] 2.316963
## [1] 226387.9
## [1] 24288.46
## mean_geometric
## 1 7.52679
##
## Call:
## lm(formula = Num_unique_pathogen_matches ~ Year, data = blob)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.836 -1.472 -1.190 1.135 9.109
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 110.83398 73.03490 1.518 0.146
## Year -0.05488 0.03688 -1.488 0.154
##
## Residual standard error: 2.87 on 18 degrees of freedom
## Multiple R-squared: 0.1096, Adjusted R-squared: 0.06009
## F-statistic: 2.215 on 1 and 18 DF, p-value: 0.154
##
## Call:
## lm(formula = Pathogen_present ~ Year, data = blob)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.70782 -0.32905 0.06491 0.35215 0.67095
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 25.555488 11.970747 2.135 0.0468 *
## Year -0.012626 0.006044 -2.089 0.0512 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4705 on 18 degrees of freedom
## Multiple R-squared: 0.1951, Adjusted R-squared: 0.1504
## F-statistic: 4.364 on 1 and 18 DF, p-value: 0.05118
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
##
## Call:
## lm(formula = mean_geometric ~ CollectionYear, data = summary_data_w_year)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.398e-06 -1.879e-06 -4.306e-07 2.092e-06 5.321e-06
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.574e-03 6.474e-05 24.309 <2e-16 ***
## CollectionYear -1.706e-08 3.270e-08 -0.522 0.607
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.686e-06 on 23 degrees of freedom
## Multiple R-squared: 0.0117, Adjusted R-squared: -0.03127
## F-statistic: 0.2722 on 1 and 23 DF, p-value: 0.6068
##
## Call:
## lm(formula = Heterozygosity ~ CollectionYear, data = combined_data_het_w_year)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.499e-04 -1.992e-05 8.000e-08 2.008e-05 1.094e-04
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.574e-03 7.589e-05 20.740 <2e-16 ***
## CollectionYear -1.706e-08 3.833e-08 -0.445 0.656
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.148e-05 on 2498 degrees of freedom
## Multiple R-squared: 7.932e-05, Adjusted R-squared: -0.000321
## F-statistic: 0.1982 on 1 and 2498 DF, p-value: 0.6563
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
##
## Call:
## lm(formula = Heterozygosity ~ CollectionYear, data = hist_highMut_w_year)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.179e-05 -2.009e-06 -1.522e-06 8.198e-06 2.858e-05
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.157e-04 4.505e-05 9.229 <2e-16 ***
## CollectionYear -1.216e-08 2.275e-08 -0.535 0.593
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.343e-06 on 623 degrees of freedom
## Multiple R-squared: 0.0004584, Adjusted R-squared: -0.001146
## F-statistic: 0.2857 on 1 and 623 DF, p-value: 0.5932
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
##
## Call:
## lm(formula = Heterozygosity ~ CollectionYear, data = bottleneck_choice[bottleneck_choice$strength ==
## "8", ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.207e-05 -5.334e-06 -8.800e-07 4.517e-06 1.793e-05
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.875e-04 1.761e-04 3.904 0.000714 ***
## CollectionYear -1.485e-07 8.896e-08 -1.669 0.108672
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.305e-06 on 23 degrees of freedom
## Multiple R-squared: 0.108, Adjusted R-squared: 0.06925
## F-statistic: 2.786 on 1 and 23 DF, p-value: 0.1087
##
## Call:
## lm(formula = Heterozygosity ~ CollectionYear, data = bottleneck_choice[bottleneck_choice$strength ==
## "16", ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.500e-05 -4.995e-06 2.661e-06 5.005e-06 1.442e-05
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.016e-04 1.584e-04 3.167 0.00431 **
## CollectionYear -5.861e-08 8.001e-08 -0.732 0.47127
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.57e-06 on 23 degrees of freedom
## Multiple R-squared: 0.0228, Adjusted R-squared: -0.01969
## F-statistic: 0.5365 on 1 and 23 DF, p-value: 0.4713
##
## Call:
## lm(formula = Heterozygosity ~ CollectionYear, data = bottleneck_choice[bottleneck_choice$strength ==
## "217", ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.778e-05 -1.247e-06 -3.533e-07 4.117e-06 1.076e-05
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.680e-04 1.947e-04 0.863 0.397
## CollectionYear 1.118e-07 9.834e-08 1.136 0.267
##
## Residual standard error: 8.075e-06 on 23 degrees of freedom
## Multiple R-squared: 0.05317, Adjusted R-squared: 0.012
## F-statistic: 1.292 on 1 and 23 DF, p-value: 0.2675
##
## Call:
## lm(formula = Heterozygosity ~ CollectionYear, data = bottleneck_choice[bottleneck_choice$strength ==
## "1932", ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.557e-05 -5.623e-06 4.360e-06 4.428e-06 1.443e-05
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.913e-04 1.892e-04 2.068 0.050 .
## CollectionYear 2.195e-09 9.555e-08 0.023 0.982
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.846e-06 on 23 degrees of freedom
## Multiple R-squared: 2.294e-05, Adjusted R-squared: -0.04345
## F-statistic: 0.0005276 on 1 and 23 DF, p-value: 0.9819
##
## Call:
## lm(formula = Heterozygosity ~ CollectionYear, data = bottleneck_choice[bottleneck_choice$strength ==
## "64000", ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.272e-04 -6.514e-05 1.002e-05 7.014e-05 1.119e-04
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.598e-03 2.141e-03 4.015 0.000542 ***
## CollectionYear -4.176e-06 1.082e-06 -3.861 0.000795 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.882e-05 on 23 degrees of freedom
## Multiple R-squared: 0.3932, Adjusted R-squared: 0.3668
## F-statistic: 14.9 on 1 and 23 DF, p-value: 0.0007946
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
##
## Call:
## lm(formula = Heterozygosity ~ CollectionYear, data = combined_data_het_pathogen_w_year_bottlenecks_reps[combined_data_het_pathogen_w_year_bottlenecks_reps$strength ==
## "8", ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.134e-04 -2.342e-05 -3.423e-06 1.657e-05 8.655e-05
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.546e-03 1.373e-04 11.255 <2e-16 ***
## CollectionYear -1.080e-09 6.936e-08 -0.016 0.988
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.848e-05 on 623 degrees of freedom
## Multiple R-squared: 3.888e-07, Adjusted R-squared: -0.001605
## F-statistic: 0.0002422 on 1 and 623 DF, p-value: 0.9876
##
## Call:
## lm(formula = Heterozygosity ~ CollectionYear, data = combined_data_het_pathogen_w_year_bottlenecks_reps[combined_data_het_pathogen_w_year_bottlenecks_reps$strength ==
## "16", ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.833e-05 -1.912e-05 3.530e-07 2.132e-05 8.153e-05
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.626e-03 1.427e-04 11.396 <2e-16 ***
## CollectionYear -4.379e-08 7.206e-08 -0.608 0.544
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.959e-05 on 623 degrees of freedom
## Multiple R-squared: 0.0005924, Adjusted R-squared: -0.001012
## F-statistic: 0.3693 on 1 and 623 DF, p-value: 0.5436
##
## Call:
## lm(formula = Heterozygosity ~ CollectionYear, data = combined_data_het_pathogen_w_year_bottlenecks_reps[combined_data_het_pathogen_w_year_bottlenecks_reps$strength ==
## "217", ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.638e-04 -1.775e-05 2.251e-06 2.390e-05 9.238e-05
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.787e-03 1.732e-04 10.317 <2e-16 ***
## CollectionYear -1.266e-07 8.748e-08 -1.447 0.148
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.592e-05 on 623 degrees of freedom
## Multiple R-squared: 0.00335, Adjusted R-squared: 0.00175
## F-statistic: 2.094 on 1 and 623 DF, p-value: 0.1484
##
## Call:
## lm(formula = Heterozygosity ~ CollectionYear, data = combined_data_het_pathogen_w_year_bottlenecks_reps[combined_data_het_pathogen_w_year_bottlenecks_reps$strength ==
## "1932", ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.122e-04 -1.420e-05 7.150e-06 3.244e-05 9.776e-05
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.851e-03 3.061e-04 9.311 < 2e-16 ***
## CollectionYear -6.648e-07 1.546e-07 -4.299 1.99e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.349e-05 on 623 degrees of freedom
## Multiple R-squared: 0.02881, Adjusted R-squared: 0.02725
## F-statistic: 18.48 on 1 and 623 DF, p-value: 1.99e-05
##
## Call:
## lm(formula = Heterozygosity ~ CollectionYear, data = combined_data_het_pathogen_w_year_bottlenecks_reps[combined_data_het_pathogen_w_year_bottlenecks_reps$strength ==
## "64000", ])
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.009e-03 -2.398e-04 4.881e-05 3.132e-04 5.328e-04
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.605e-02 1.881e-03 19.16 <2e-16 ***
## CollectionYear -1.756e-05 9.504e-07 -18.48 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.0003902 on 623 degrees of freedom
## Multiple R-squared: 0.3541, Adjusted R-squared: 0.353
## F-statistic: 341.5 on 1 and 623 DF, p-value: < 2.2e-16